COMMEN.TEX[F87,JMC] - www.SailDart.org

perm filename COMMEN.TEX[F87,JMC] blob sn#850855 filedate 1987-12-28 generic text, type T, neo UTF8
% -*- Mode: Text; Base:10; Package: User; Default-character-style: (:FIX :BOLD :NORMAL) -*-
  
\documentstyle[12pt]{report}
\title{The 15-Puzzle Solver}
\author{David R Throop}
\date{\today}

  
\begin{document}
\maketitle

\chapter{Loading the PUZZLE System}

The program runs and solves all puzzles that I give it.

To load the system, type (to the prompt in a Lisp Listener):

\begin{verbatim} Command: Load System Puzzle \end{verbatim}

The appropriate files will be loaded.  To investigate the puzzle, set
the package with: 

\begin{verbatim} Command: Set Package Puzzle \end{verbatim}

and then 

\begin{verbatim} Command: (solve (random-board)) \end{verbatim}

The system will generate a random board position, and then solve it.
While the system is running, it gives an audible tone every time it
expands a node.  The higher pitched beeps are when it accepts a sequence
of moves.  To adjust the volume on the console, hold down (for several
seconds) either <local>-q (for quiet) or <local>-l (for loud).  It will
print out an intermediate board position every 1000 nodes, along with
some statistics about which heuristics have succeeded.

To edit the files, go to the ZMACS window and give the command:

\begin{verbatim} (M-x) Select System as Tag Table   \end{verbatim}

It will prompt for a system; answer PUZZLE. Then give the command:

\begin{verbatim} (M-x) Find Files in Tag Table \end{verbatim}

It will load all of the files, including some text files, into editor
buffers.  

\chapter{Different LISP Files}

The order in which the different files are loaded is the order in which
they are listed in the Puzzle system definition.  It is in the file {\em
Puzzle:puzzle;puzzle.system}.

\section{DataStructures.Lisp}

DataStructures contains the basic data structures of the system.  Both
the boards and the queue of unexpanded nodes are implemented as
DEFSTRUCTs.  

Boards store their positions as 1 dimensional arrays.  (I considered
using 2-d arrays, but since we're often accessing the things using the
tile number as an index, it actually seems simpler to use a 1-d array.
Because the positions are always accessed via a limited number of
accessor functions, this decision is still reversible.)

MOVES is the slot in a board which holds the moves which the board has
executed.  It is initialized to NIL, then each move is PUSHed onto it,
(so the list is the reverse of the sequence of moves, with the most
recent move first.)  Each move is a number, the number of the square
into which the blank is moved.

Boards also have slots for the current position of the blank, the
completed chain, the last complete row, and the side.  These things are
all implicit in the BOARD position, but are cached here.  There are
comments in the file {\em DataStructures.Lisp} at the definition of
BOARD which explain these a bit more.  The initial position of the blank
is also stored.  There is a comment about it at the definition of
PREVIOUS-MOVE.

The system has numerous boards defined with DEFPARAMETER.  Most of these
boards represent interesting positions to solve, but two boards are
actually used by the program every time it solves a puzzle.  These are
*Base-Board* and *Hidden-Board*.  There are more comments on these
below.

The *Queue* is a FIFO queue of nodes.  The *Queue* is never left empty -
when a node is accepted, the other nodes are flushed from the queue, and
that node is added.  

Each node represents a board position which is reachable from the
current state of the *Base-Board*.  Each node is represented by a list
of moves (in reverse order, as with the MOVES slot in the BOARDs.)  The
MOVES list of the *Base-Board* (when non-null) is a tail of all of the
nodes in the *queue*.  The position represented by the node could be
reached by executing the moves in the node on the original position.
Alternatively, the position could be reached by advancing the
*Base-Board* by all of the moves {\em between} the head of the node and
its tail, the moves that the *Base-Board* has already executed.  (There
is more commentary on this at MAKE-INTERVENING-MOVES.)

\subsection{Statistical Counters}

These keep track of the statistics of a solution search: \begin{itemize}
\item *Acceptances* - The solution is reached several moves at a time.
The program may search, say, five plies deep before it finds a position
BETTER than the current one.  It then accepts the node, increasing the
number of moves by 5.  *Acceptances* counts the number of nodes accepted
- the number of (possibly multiple-move) legs by which the solution is
reached.  *Acceptances* is equal to the number of times that the BETTER
function has succeeded on any node.

\item *Rejections* - This is the dual of *Acceptances*.  It is the
number of times that the WORSE function has succeeded on any node.

\item *Nodes-Considered*  - The number of nodes  generated.  It does
{\em not} equal the number of nodes expanded.  Each node is considered
(ie, the BETTER and WORSE functions are applied) as soon as it is
generated.  If it is neither BETTER nor WORSE, it is added to the queue
for later expansion, but *Nodes-Considered* counts it even if the queue
is flushed before it is reached.
\end{itemize}

\subsection{Tracing Forms}

*Acceptance-Trace* and TFORM are used to either generate a somewhat
detailed trace of the course of the solution, or to generate the tones
that audibly trace it.

\section{Boards-And-Queues.Lisp}

This file is made up of numerous accessor functions and a few
miscellaneous board manipulation functions.  It is extensively
commented.  

There are some comments about the definition of CONTIGUOUS at the
definition of the heuristic DONT-BREAK-CHAIN.

It is worth noting that BETTER and WORSE, when they succeed, return the
heuristic which succeeded as their values.  This is mainly used to
gather the statistics.

STORED-SUCCESSORS accesses the cache of moves which can be made with the
blank at any position.  It accesses the data structure
*STORED-SUCCESSORS*; this structure is only set now for a 4x4 board.

\section{Moving-Functions.Lisp}

This file contains the main calculational routines of the system and is
extensively commented.  The explanations of how these functions work are
in these comments and are not repeated here.  For an illustration of the
calling dependencies of these structures, see the file {\em
Calling-Dependencies.Text}.

Solve is the main function of the program - it calls all of the other
functions used to solve a puzzle.

\section{Heuristics.Lisp}

The tests used by BETTER and WORSE are stored on *BETTER-MEASURES* and
*WORSE-MEASURES*, respectively.  The heuristics are individually
discussed in the comments at the code; again, that discussion is not
repeated here.

\section{Testing.Lisp}

The functions in this file are not actually used to solve the puzzles.
However, some of them are called by the rest of the code.
CHECK-GOODNESS is called at initialization to check for malformed
boards.  SHOWBOARD is the main display function, the function that
prints out the boards and statistics while the problem is solving.
There are also several boards stored here that are useful for debugging
the system, or for checking to make sure that they system still works
after changes.

*STUCK* is an unsolvable board.  If the system succeeds on it, something
is fishy.

*SOLVED-BOARD* is a 4x4 board in solved position.  It is the default
position generated by MAKE-BOARD. 

RANDOM-BOARD generates a random, solvable board position.  It does this
by starting with a solved board and backing it up through a random path.
It uses the permanent board, *RANDOM-BOARD*, to receive that position.

\section{Weird-Code.Lisp}

This is just a bunch of function calls that test to make sure
subsections of the code are running OK.  I've found in past projects
that it is useful to save this stuff; if the code breaks later it saves
lots of typing.  But it is of no significance to the program.

\section{Puzzle.System}

The file {\em Sys:Site;Puzzle.System} contains the definition of the
PUZZLE package.  If you are unfamiliar with packages, check Steele's
{\em Common Lisp}.

The file {\em Puzzle:Puzzle;Puzzle.System} gives the system definition.

The definitions are spread out between the two {\em Puzzle.System}
files.  The Symbolics software always knows to look first in the
{\em Sys:site;} directory for a system file.  The actualy system
definition is kept in the {\em Puzzle:Puzzle;} directory because it
needed to be changed frequently as the system was devoloped.

\chapter{The Heuristics}

Currently there are two BETTER heuristics and four WORSE heuristics.
They were added to the system in this order:
\begin{itemize} \item Manhattan-Distance {\em (better)},
 \item Completed-Rows {\em (worse)},
\item Next-To-Last-Row {\em (worse)},
\item Dont-Break-Chain {\em (worse)},
 \item Achieve-Two-Rows {\em (better)}, and
\item Two-Row-Restriction {\em (worse)}. \end{itemize}
The program would sometimes succeed with only the first two heuristics
in place.  The overall performance improved with each addition; but as I
discuss below, in particular places it actually worsened.  I don't have
a good fix on the actual  metric - I was still turning up substantial
bugs in the programs while adding these heuristics.  I haven't gone back
to compare performance with fewer heuristics now that the program is
running well.

Again, this discussion is only a supplement to the discussion in the
comments in the code itself.

\section{BETTER Heuristics}

Currently there are two of these.

\subsection{Manhattan-Distance}

This is the basic BETTER heuristic.   The system performs tolerably well
with only this improvement heuristic.  Indeed, any real progress is
always reflected as an improvement in Manhattan-Distance; any other
improvement is movement towards a state where Manhattan-Distance can
improve.  There are two situations where such improvement is so distant
that this heuristic alone leads to excessive search.  The first is
illustrated by: 

The second is in the endgame, when the blank is on the last row, but the
last row is out of order.
\begin{figure}
\begin{verbatim}
   1     2     3     x
   x     x     x     4			; Manhattan-Distance flounders,
   x     x     x   :blank		; but Achieve-Two-Rows succeeds
   x     x     x     x			; in this situation



   1     2     3     4			; This is still a difficult 
   5     6     7     8			; position for the system to
   9    10    11    12			; solve.
:blank   x     x     x \end{verbatim}
\caption{Two Difficult Boards}\label{difficult}\end{figure}

\subsection{Completed Rows}

Don't break up completed rows.  The exception is if the last completed
row is the penultimate row.  It may have to be broken in order to rotate
the tiles in the last row.

\subsection{Next-To-Last-Row}

NEXT-TO-LAST-ROW is a weaker form of COMPLETED-ROWS.  It allows the
penultimate row to be broken, but not the row before that.  The test in
WHEN clause is only for efficiency - any other time that it succeeds,
COMPLETED-ROWS would also succeed.

\subsection{Dont-Break-Chain}

The comments in the code explain why this heuristic is not as strong as
it could be.  I also noted that, although this improved the overall
performance, it noticeably degraded the performance at the point where
the last tile is being added to complete a row.  This is presumably
because there is the shorter path where only the last two tiles in the
chain are kept contiguous.  When the system has to rotate out the entire
chain, it searches much longer.  This performance penalty would
presumably be much worse in a 5x5 puzzle.

What we may actually mean here is to not undo the chain at all {\em
except} to either move in the last tile in a row, or to rotate tiles in
the last row.

\subsection{Achieve Two Rows}

This heuristic only rarely succeeds.  It often only succeeds once.  But
because it breaks the situation noted in the first board shown in Figure
\ref{difficult} it makes a noticeable difference in the number of nodes
searched.

\subsection{Two-Row-Restriction}

Note the comments in the code for an English-language description of
what this heuristic does.  I was surprised - although this heuristic does
seem to reduce the total number of moves made, it seems to increase the
number of nodes searched.  This is because of the situation in Figure
\ref{block}.  Without this heuristic, the blank would travel down
through row 4 and around the 8-tile, and then move the 8-tile into
square 10.  With this heuristic in place, the 5-6-7 chain is rotated.
This leads to a shorter final solution, but because the rotation
combination involves more steps, it takes markedly longer to find it.

\begin{figure}\begin{verbatim}
   1     2     3     4			; Two-Row-Restriction actually
   5     6     7     x			; increases the amount of search
:blank   8     x     x			; needed to get out of this 
   x     x     x     x			; position.  \end{verbatim}
\caption{The Blank is Blocked by Tile 8}\label{block}\end{figure}

\end{document}